Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suggested clarification to whiteouts #1215

Merged
merged 1 commit into from
Nov 6, 2024
Merged

Conversation

aconz2
Copy link
Contributor

@aconz2 aconz2 commented Oct 30, 2024

I found it confusing that the first example given for the whiteouts section was only about opaque whiteouts even though that gets its own section. So I added a simple example with just a file and directory being deleted.

The second commit fixes what I believe is a typo in the opaque whiteout example, where it says that a/b is deleted, but I think it is a/ that is deleted.

Copy link
Member

@tianon tianon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly LGTM, one minor wording nit 👍

layer.md Outdated Show resolved Hide resolved
tianon
tianon previously approved these changes Oct 30, 2024
Copy link
Contributor

@sudo-bmitch sudo-bmitch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My reading of the opaque white out is different:

An opaque whiteout entry is a file with the name .wh..wh..opq indicating that all siblings are hidden in the lower layer.

Which means that a itself is not deleted, only the contents of the directory. I verified that with the following example:

$ regctl manifest get sudobmitch/demo:opaque-wh
Name:        sudobmitch/demo:opaque-wh
MediaType:   application/vnd.docker.distribution.manifest.v2+json
Digest:      sha256:9310089ee7088013005a78fdf549aac248c873c37c30559de2ee687710ca5b6c
Total Size:  2.153MB

Config:
  Digest:    sha256:ef5cf31c087d8b2178e189770e123dfbd104d03dc6be72a44a49e0efa708137c
  MediaType: application/vnd.docker.container.image.v1+json
  Size:      841B

Layers:

  Digest:    sha256:ec562eabd705d25bfea8c8d79e4610775e375524af00552fe871d3338261563c
  MediaType: application/vnd.docker.image.rootfs.diff.tar.gzip
  Size:      2.153MB

  Digest:    sha256:2d6473f09615f3446a3dd09cdf3d554e8080c43e8946e007c543df8feb759ee9
  MediaType: application/vnd.docker.image.rootfs.diff.tar.gzip
  Size:      222B

  Digest:    sha256:77deb7c636039bbea6511a6de67dc0fb549ba3ee9c0004bd82c986ccf4773dd9
  MediaType: application/vnd.docker.image.rootfs.diff.tar.gzip
  Size:      121B

$ regctl blob get sudobmitch/demo:opaque-wh sha256:2d6473f09615f3446a3dd09cdf3d554e8080c43e8946e007c543df8feb759ee9 | tar -tvzf -
drwxr-xr-x 0/0               0 2024-11-01 10:39 etc/
drwxr-xr-x 0/0               0 2024-11-01 10:39 proc/
drwxr-xr-x 0/0               0 2024-11-01 10:39 sys/
drwxr-xr-x 0/0               0 2024-11-01 10:39 usr/
drwxr-xr-x 0/0               0 2024-11-01 10:39 usr/local/
drwxr-xr-x 0/0               0 2024-11-01 10:39 usr/local/bin/
drwxr-xr-x 0/0               0 2024-11-01 10:39 usr/local/tmp/
-rw-r--r-- 0/0              12 2024-11-01 10:39 usr/local/tmp/hello.txt

$ regctl blob get sudobmitch/demo:opaque-wh sha256:77deb7c636039bbea6511a6de67dc0fb549ba3ee9c0004bd82c986ccf4773dd9 | tar -tvzf -
-rw-r--r-- 0/0               0 2024-11-01 10:42 usr/local/.wh..wh..opq

$ docker run -it --rm sudobmitch/demo:opaque-wh ls -al /usr/local
total 8
drwxr-xr-x    2 root     root          4096 Nov  1 14:50 .
drwxr-xr-x    1 root     root          4096 Nov  1 14:50 ..

layer.md Outdated
a/b/c/bar
```

When the next layer is created, the original `a/` directory is deleted and recreated with `a/b/c/foo`:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The a directory itself is not deleted, only all entries in the a directory, if my reading is correct. To make it more clear, we could also add an a/baz that also gets deleted.

Suggested change
When the next layer is created, the original `a/` directory is deleted and recreated with `a/b/c/foo`:
When the next layer is created, the original `a/b` directory is deleted and recreated with `a/b/c/foo`:

layer.md Outdated
a/b/c/foo
```

When processing the second layer, `a/.wh..wh..opq` is applied first, before creating the new version of `a/`, regardless of the ordering in which the whiteout file was encountered.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
When processing the second layer, `a/.wh..wh..opq` is applied first, before creating the new version of `a/`, regardless of the ordering in which the whiteout file was encountered.
When processing the second layer, `a/.wh..wh..opq` is applied first, before creating the new version of `a/b`, regardless of the ordering in which the whiteout file was encountered.

layer.md Show resolved Hide resolved
layer.md Outdated Show resolved Hide resolved
layer.md Show resolved Hide resolved
@aconz2
Copy link
Contributor Author

aconz2 commented Nov 1, 2024

I tested the example from the spec (including a/baz), where Containerfile.1 does rm -rf a/* and Containerfile.2 does rm -rf a/. Containerfile.3 and 4 are for demonstration about just doing rm -rf a/* without recreating any files and 4 does the same as 1 in two steps. Containerfile.2 matches the example given in the spec.

I'm more of a podman user but tested with docker too just to see (I manually trimmed out a deprecated warning from docker for wanting to switch to buildkit). Podman works the same but it includes extra stuff like /{proc,run,sys} which I incidentally have asked about here.

The layers from 1 and 2 both represent equivalent end states given their shared base layer. But, in isolation as a changeset, they mean different things. If we imagine taking just those change layers and applying them in turn onto some other base layer that has a directory a/new, then applying the last layer of Containerfile.1 (rm -rf a/*) WILL show a/new, whereas applying the last layer of Containerfile.2 (rm -rf a/ WILL NOT show a/new.

And I just realized that these two lines

Using explicit whiteout files, this would be equivalent to the following:

Implementations SHOULD generate layers using explicit whiteout files, but MUST accept both.

are problematic because using explicit whiteouts on every directory entry is not equivalent to an opaque dir.

test script
set -e

engine="${1:-podman}"

function get_layers() {
    image_manifest=$(jq -r '.manifests[0].digest' index.json | sed 's_:_/_')
    jq -r '.layers[].digest' blobs/$image_manifest | sed 's_:_/_'
}

function show_layers() {
    rm -rf oci-dir && mkdir oci-dir
    pushd . &>/dev/null
    if [ "$engine" = "podman" ]; then
        podman save --format oci-archive $1 | tar xf - -C oci-dir
    else
        $engine save $1 | tar xf - -C oci-dir
    fi
    cd oci-dir
    for layer in $(get_layers | tail -n+2); do
        echo $layer
        tar tvf blobs/$layer
        echo
    done
    popd &>/dev/null
}

echo "using engine $engine"
mkdir -p /tmp/question
cd /tmp/question
name=opaque-dir

cat << "EOF" > Containerfile.1
FROM busybox
RUN mkdir -p a/b/c/ && touch a/b/c/bar a/baz
RUN rm -rf a/* && mkdir -p a/b/c && touch a/b/c/foo
EOF

cat << "EOF" > Containerfile.2
FROM busybox
RUN mkdir -p a/b/c/ && touch a/b/c/bar a/baz
RUN rm -rf a/ && mkdir -p a/b/c && touch a/b/c/foo
EOF

cat << "EOF" > Containerfile.3
FROM busybox
RUN mkdir -p a/b/c/ && touch a/b/c/bar a/baz
RUN rm -rf a/*
EOF

cat << "EOF" > Containerfile.4
FROM busybox
RUN mkdir -p a/b/c/ && touch a/b/c/bar a/baz
RUN rm -rf a/*
RUN mkdir -p a/b/c && touch a/b/c/foo
EOF

for containerfile in Containerfile.{1,2,3,4}; do
    echo "# $containerfile"
    cat $containerfile
    $engine rmi $name >/dev/null || true
    if [ "$engine" = "podman" ]; then
        $engine build --dns=none --no-hosts --no-hostname -f $containerfile -t $name >/dev/null
    else
        $engine build -f $containerfile -t $name . >/dev/null
    fi
    show_layers $name

    echo -e "---------------------------------\n"
done
using engine docker
# Containerfile.1
FROM busybox
RUN mkdir -p a/b/c/ && touch a/b/c/bar a/baz
RUN rm -rf a/* && mkdir -p a/b/c && touch a/b/c/foo

sha256/50c092ecfb6009ae45b750dc59124b3e65f8659f3f69879bfc8fc259b730a86f
drwxr-xr-x 0/0               0 2024-11-01 15:01 a/
drwxr-xr-x 0/0               0 2024-11-01 15:01 a/b/
drwxr-xr-x 0/0               0 2024-11-01 15:01 a/b/c/
-rw-r--r-- 0/0               0 2024-11-01 15:01 a/b/c/bar
-rw-r--r-- 0/0               0 2024-11-01 15:01 a/baz

sha256/dc1058f5c091386239ecfa7b4432dd15365f799d91121b43c242b7b4da3690d0
drwxr-xr-x 0/0               0 2024-11-01 15:01 a/
drwxr-xr-x 0/0               0 2024-11-01 15:01 a/b/
-rwxr-xr-x 0/0               0 1969-12-31 18:00 a/b/.wh..wh..opq
drwxr-xr-x 0/0               0 2024-11-01 15:01 a/b/c/
-rw-r--r-- 0/0               0 2024-11-01 15:01 a/b/c/foo
-rw------- 0/0               0 2024-11-01 15:01 a/.wh.baz

---------------------------------

# Containerfile.2
FROM busybox
RUN mkdir -p a/b/c/ && touch a/b/c/bar a/baz
RUN rm -rf a/ && mkdir -p a/b/c && touch a/b/c/foo

sha256/095283817030b7b6cd9a77912ba989ad960253e4cf6daa8f799c7945ebfbd02e
drwxr-xr-x 0/0               0 2024-11-01 15:01 a/
drwxr-xr-x 0/0               0 2024-11-01 15:01 a/b/
drwxr-xr-x 0/0               0 2024-11-01 15:01 a/b/c/
-rw-r--r-- 0/0               0 2024-11-01 15:01 a/b/c/bar
-rw-r--r-- 0/0               0 2024-11-01 15:01 a/baz

sha256/f6239847f89d82104c79172b0de09e71173896f2ef34250078f2b212ca927d00
drwxr-xr-x 0/0               0 2024-11-01 15:01 a/
-rwxr-xr-x 0/0               0 1969-12-31 18:00 a/.wh..wh..opq
drwxr-xr-x 0/0               0 2024-11-01 15:01 a/b/
drwxr-xr-x 0/0               0 2024-11-01 15:01 a/b/c/
-rw-r--r-- 0/0               0 2024-11-01 15:01 a/b/c/foo

---------------------------------

# Containerfile.3
FROM busybox
RUN mkdir -p a/b/c/ && touch a/b/c/bar a/baz
RUN rm -rf a/*

sha256/187aa901df4b81d8e94f0a870392d9d6b59d8bf0206acdfa78b3fea1d70ac9e0
drwxr-xr-x 0/0               0 2024-11-01 15:01 a/
drwxr-xr-x 0/0               0 2024-11-01 15:01 a/b/
drwxr-xr-x 0/0               0 2024-11-01 15:01 a/b/c/
-rw-r--r-- 0/0               0 2024-11-01 15:01 a/b/c/bar
-rw-r--r-- 0/0               0 2024-11-01 15:01 a/baz

sha256/3ed74fb3bfe4785cbbac5a522c327c5cd9c67037dfc7d94e2284702e89d99131
drwxr-xr-x 0/0               0 2024-11-01 15:01 a/
-rw------- 0/0               0 2024-11-01 15:01 a/.wh.b
-rw------- 0/0               0 2024-11-01 15:01 a/.wh.baz

---------------------------------

# Containerfile.4
FROM busybox
RUN mkdir -p a/b/c/ && touch a/b/c/bar a/baz
RUN rm -rf a/*
RUN mkdir -p a/b/c && touch a/b/c/foo

sha256/bf437df5740c74318d371585b1c04dc6dbb6618d1677cba0ca0528788d535ce4
drwxr-xr-x 0/0               0 2024-11-01 15:01 a/
drwxr-xr-x 0/0               0 2024-11-01 15:01 a/b/
drwxr-xr-x 0/0               0 2024-11-01 15:01 a/b/c/
-rw-r--r-- 0/0               0 2024-11-01 15:01 a/b/c/bar
-rw-r--r-- 0/0               0 2024-11-01 15:01 a/baz

sha256/9bbdde336d324eedadd7514c496ddd67a6536d8b7de4364b8bde908c48c52ae0
drwxr-xr-x 0/0               0 2024-11-01 15:01 a/
-rw------- 0/0               0 2024-11-01 15:01 a/.wh.b
-rw------- 0/0               0 2024-11-01 15:01 a/.wh.baz

sha256/23afa1d396f76bd18f1d91943ffb1f3a21477c32fc0d911017fef62746070a24
drwxr-xr-x 0/0               0 2024-11-01 15:01 a/
drwxr-xr-x 0/0               0 2024-11-01 15:01 a/b/
drwxr-xr-x 0/0               0 2024-11-01 15:01 a/b/c/
-rw-r--r-- 0/0               0 2024-11-01 15:01 a/b/c/foo

---------------------------------

@sudo-bmitch
Copy link
Contributor

And I just realized that these two lines

Using explicit whiteout files, this would be equivalent to the following:

Implementations SHOULD generate layers using explicit whiteout files, but MUST accept both.

are problematic because using explicit whiteouts on every directory entry is not equivalent to an opaque dir.

They are the same if you know all of the entries in the directory (which build tools often know). They are different if you either change the definition of opaque whiteouts to include the parent directory, or if layers are being rebased (possible, but not common).

How does podman handle: docker run -it --rm sudobmitch/demo:opaque-wh ls -al /usr/local? Is the directory there or not found? I'm more interested in that since the other examples recreate the parent folder that we are debating whether it would be deleted.

@aconz2
Copy link
Contributor Author

aconz2 commented Nov 1, 2024

How does podman handle: docker run -it --rm sudobmitch/demo:opaque-wh ls -al /usr/local

→ podman run -it --rm sudobmitch/demo:opaque-wh ls -al /usr/local
total 0
drwxr-xr-x    1 root     root             0 Nov  1 22:56 .
drwxr-xr-x    1 root     root            10 Nov  1 22:56 ..

I'm a bit confused by your example in how it relates to the example from the spec. Can you show the Containerfile?

@sudo-bmitch
Copy link
Contributor

How does podman handle: docker run -it --rm sudobmitch/demo:opaque-wh ls -al /usr/local

→ podman run -it --rm sudobmitch/demo:opaque-wh ls -al /usr/local
total 0
drwxr-xr-x    1 root     root             0 Nov  1 22:56 .
drwxr-xr-x    1 root     root            10 Nov  1 22:56 ..

I'm a bit confused by your example in how it relates to the example from the spec. Can you show the Containerfile?

That shows podman and containerd are consistent and the existing spec should not be changed. It proves that a/ is not deleted in the examples.

The blob was generated by hand because my attempt to create it with buildkit resulted in individual whiteout files as recommended by the spec.

Co-authored-by: Brandon Mitchell <[email protected]>
Co-authored-by: Tianon Gravi <[email protected]>
Signed-off-by: Andrew Consroe <[email protected]>
@aconz2
Copy link
Contributor Author

aconz2 commented Nov 4, 2024

Okay I think we've reached an impasse, I've backed out the change of a/b and included the suggested changes and squashed.

Copy link
Contributor

@sudo-bmitch sudo-bmitch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@sudo-bmitch
Copy link
Contributor

Okay I think we've reached an impasse, I've backed out the change of a/b and included the suggested changes and squashed.

FWIW, if you wanted to delete a/ and not all the contents of a/, that would be .wh.a.

Copy link
Member

@tianon tianon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (although I think it's amusing that in the moved example, the text says we deleted a/b but the actual example would be created by deleting and recreating a/ entirely, which I guess is what Brandon was getting at with his comment about .wh.b)

@tianon tianon merged commit 45c76ea into opencontainers:main Nov 6, 2024
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants